This paper proposes a novel approach to person re-identification, a fundamental task in distributed multi-camera surveillance systems. Although a variety of powerful algorithms have been presented in the past few years, most of them focus on designing hand-crafted features and learning metrics either individually or sequentially. In contrast to previous work, we formulate a unified deep ranking framework that jointly tackles both of these key components to maximize their strengths. We start from the principle that the correct match of the probe image should be positioned at the top rank within the whole gallery set. An effective learning-to-rank algorithm is proposed to minimize the cost corresponding to the ranking disorders of the gallery. The ranking model is solved with a deep convolutional neural network (CNN) that builds the relation between input image pairs and their similarity scores through joint representation learning directly from raw image pixels. The proposed framework removes the need for feature engineering and does not rely on any assumption. An extensive comparative evaluation is given, demonstrating that our approach significantly outperforms all state-of-the-art approaches, including both traditional and CNN-based methods, on the challenging VIPeR, CUHK-01 and CAVIAR4REID datasets. Additionally, our approach generalizes better across datasets without fine-tuning.
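As an illustration of the ranking principle stated above, the following is a minimal sketch of a cost that grows when gallery images score close to or above the correct match. It is not the loss used in the paper, whose exact form the abstract does not give: the pairwise hinge formulation, the `margin` parameter, and the function names `ranking_disorder_cost` and `rank_of_match` are all assumptions introduced here. The similarity scores are taken as plain numbers, standing in for the outputs a CNN would produce for each probe-gallery pair.

```python
from typing import Sequence


def ranking_disorder_cost(
    scores: Sequence[float], match_index: int, margin: float = 1.0
) -> float:
    """Hinge-style cost over one probe's gallery scores (assumed form).

    scores[i] is the similarity between the probe and gallery image i;
    match_index marks the correct match. Every other gallery image adds
    a penalty unless the match beats it by at least `margin`.
    """
    if not 0 <= match_index < len(scores):
        raise ValueError("match_index must point into scores")
    match_score = scores[match_index]
    cost = 0.0
    for i, score in enumerate(scores):
        if i == match_index:
            continue
        # Zero when the match leads by `margin` or more, positive otherwise.
        cost += max(0.0, margin - (match_score - score))
    return cost


def rank_of_match(scores: Sequence[float], match_index: int) -> int:
    """1-based rank of the correct match when sorting by descending score."""
    match_score = scores[match_index]
    better = sum(
        1 for i, score in enumerate(scores)
        if i != match_index and score > match_score
    )
    return 1 + better


if __name__ == "__main__":
    gallery_scores = [0.2, 0.9, 0.4]  # hypothetical similarity scores
    print(rank_of_match(gallery_scores, 1))
    print(ranking_disorder_cost(gallery_scores, 1, margin=0.3))
```

Under this assumed formulation, the cost is zero exactly when the correct match leads every other gallery image by the margin, which implies it sits at rank 1. In a trained model the scores would depend on network parameters, and the cost would be minimized with respect to them.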